
    Knowledge Unlearning for Mitigating Privacy Risks in Language Models

    Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both of which require re-training the underlying LM. We propose knowledge unlearning as an alternative method for reducing privacy risks in LMs post hoc. We show that simply applying the unlikelihood training objective to target token sequences is effective at forgetting them with little to no degradation of general language modeling performance; it sometimes even substantially improves the underlying LM after just a few iterations. We also find that sequential unlearning is better than trying to unlearn all the data at once, and that unlearning is highly dependent on which kind of data (domain) is forgotten. Comparing against a previous data preprocessing method known to mitigate privacy risks for LMs, we show that unlearning can give a stronger empirical privacy guarantee in scenarios where the data vulnerable to extraction attacks are known a priori, while being orders of magnitude more computationally efficient. We release the code and dataset needed to replicate our results at https://github.com/joeljang/knowledge-unlearning
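    The core recipe described above is simple enough to sketch. Below is a minimal, illustrative sketch (not the authors' released implementation; see the linked repository for that) of unlearning a target sequence by penalizing its likelihood, here realized as gradient ascent on the standard causal-LM loss. The model name, learning rate, iteration count, and example text are assumptions for illustration only.

```python
# Minimal unlearning sketch (illustrative, not the authors' code):
# push a causal LM away from a memorized target sequence by ascending
# the language-modeling loss on that sequence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM; the paper's models may differ
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)  # illustrative hyperparameters

# Hypothetical target sequence standing in for data vulnerable to extraction.
target_text = "Example sentence containing private information to be forgotten."
batch = tokenizer(target_text, return_tensors="pt")

for step in range(10):  # "just a few iterations", per the abstract
    outputs = model(**batch, labels=batch["input_ids"])
    # outputs.loss is the negative log-likelihood of the target tokens;
    # negating it turns gradient descent into gradient ascent on that NLL,
    # reducing the probability the LM assigns to the memorized sequence.
    unlearning_loss = -outputs.loss
    optimizer.zero_grad()
    unlearning_loss.backward()
    optimizer.step()
    print(f"step {step}: target NLL = {outputs.loss.item():.3f}")
```

    In practice one would monitor general language-modeling quality (e.g., validation perplexity) alongside extraction metrics to confirm the "little to no degradation" behavior reported in the abstract.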

    Contextualized Generative Retrieval

    The text retrieval task is mainly performed in two ways: the bi-encoder approach and the generative approach. The bi-encoder approach maps document and query embeddings to a common vector space and performs a nearest-neighbor search. It shows consistently high performance and efficiency across different domains, but it suffers from an embedding-space bottleneck because queries and documents interact only through L2 or inner-product distances. The generative retrieval model retrieves by generating a target sequence and overcomes this bottleneck by interacting in the parametric space. However, it fails to retrieve information it has not seen during training, as it depends solely on the information encoded in its own model parameters. To leverage the advantages of both approaches, we propose a Contextualized Generative Retrieval model, which uses contextualized embeddings (output embeddings of a language model encoder) as vocab embeddings at the decoding step of generative retrieval. The model uses information encoded in both the non-parametric space of contextualized token embeddings and the parametric space of the generative retrieval model. Our approach of generative retrieval with contextualized vocab embeddings outperforms generative retrieval with only vanilla vocab embeddings on the document retrieval task: on average 6% higher on KILT (NQ, TQA) and 2x higher on NQ-320k, suggesting the benefits of using contextualized embeddings in generative retrieval models.
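    To make the decoding-step idea concrete, the following is a minimal sketch (under assumptions; not the paper's implementation) of scoring decoder hidden states against contextualized encoder output embeddings instead of the static vocabulary embedding matrix. The T5-small backbone, the example query, and the candidate document identifier are all illustrative choices.

```python
# Illustrative sketch: replace the static vocab-embedding scoring of a
# generative retriever with scoring against contextualized token embeddings.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "t5-small"  # assumption: an encoder-decoder LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()

query = "who wrote on the origin of species"   # hypothetical query
identifier = "On the Origin of Species"        # hypothetical document identifier

with torch.no_grad():
    # Non-parametric side: contextualized "vocab" embeddings are the encoder
    # outputs over the identifier tokens, rather than rows of a fixed matrix.
    id_tokens = tokenizer(identifier, return_tensors="pt").input_ids
    ctx_embeds = model.encoder(input_ids=id_tokens).last_hidden_state[0]   # (T_id, d)

    # Parametric side: decoder hidden states while generating the identifier
    # conditioned on the query.
    query_tokens = tokenizer(query, return_tensors="pt").input_ids
    out = model(
        input_ids=query_tokens,
        decoder_input_ids=id_tokens,   # teacher-forced for this sketch
        output_hidden_states=True,
    )
    dec_hidden = out.decoder_hidden_states[-1][0]                          # (T_id, d)

    # Instead of logits = dec_hidden @ static_vocab_embeddings.T, score each
    # decoding step against the contextualized embeddings of candidate tokens.
    step_scores = dec_hidden @ ctx_embeds.T                                # (T_id, T_id)
    print(step_scores.shape)
```

    A real system would build the contextualized embedding index over the whole candidate corpus offline and use it during constrained decoding; the sketch only shows the change in how per-step logits are computed.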

    Nanosilver Colloids-Filled Photonic Crystal Arrays for Photoluminescence Enhancement

    To improve surface plasmon-coupled photoluminescence emission, a more accessible fabrication method for a controlled nanosilver pattern array was developed by filling a predefined hole array with a nanosilver colloid in a UV-curable resin via direct nanoimprinting. When applied to a light-emitting glass substrate with an oxide spacer layer on top of the nanosilver pattern, hybrid emission enhancements were produced from both the localized surface plasmon resonance-coupled emission enhancement and the guided light extraction from the photonic crystal array. When CdSe/ZnS nanocrystal quantum dots were deposited as the active emitter, a total photoluminescence intensity improvement of 84% was observed, attributed to contributions from both the silver nanoparticle filling and the nanoimprinted photonic crystal array.

    Clinical characteristics and mortality of patients with hematologic malignancies and COVID-19: a systematic review

    OBJECTIVE: Patients with hematologic cancer and Coronavirus Disease 2019 (COVID-19) tend to have a more serious disease course than the general population. Herein, we comprehensively reviewed the existing literature and analyzed the clinical characteristics and mortality of patients with hematologic malignancies and COVID-19. MATERIALS AND METHODS: By searching PubMed up to June 03, 2020, we identified 16 relevant case studies (33 cases) from a total of 45 studies reporting on patients with COVID-19 and hematologic malignancies. We investigated clinical and laboratory characteristics, including the type of hematologic malignancy, initial symptoms, laboratory findings, and clinical outcomes. We then compared these characteristics and outcomes with those of the general population infected with COVID-19. RESULTS: The median age was 66 years. Chronic lymphocytic leukemia was the most common type of hematologic malignancy (39.4%). Fever was the most common symptom (75.9%). Most patients had normal leukocyte counts (55.6%), lymphocytosis (45.4%), and normal platelet counts (68.8%). Compared with COVID-19 patients without underlying hematologic malignancies, dyspnea was more prevalent (45.0 vs. 24.9%, p=0.025). Leukocytosis (38.9 vs. 9.8%, p=0.001), lymphocytosis (45.4 vs. 8.2%, p=0.001), and thrombocytopenia (31.3 vs. 11.4%, p=0.036) were significantly more prevalent, and lymphopenia (18.2 vs. 57.4%, p=0.012) significantly less prevalent, in patients with hematologic malignancies. No clinical or laboratory characteristics predicted mortality in patients with hematologic malignancies. Mortality was much higher in patients with hematologic malignancies than in those without (40.0 vs. 3.6%, p<0.001). CONCLUSIONS: Co-occurrence of hematologic malignancies and COVID-19 is rare. However, given the high mortality from COVID-19 in this vulnerable population, further investigation into tailored treatment and management is required.